Used as: Division One - Level Three:
| Value | 1000 |
| Submission Rate | 5 / 541 (0.92%) |
| Success Rate | 1 / 5 (20.00%) |
| High Score | Petr for 446.30 points (46 mins 8 secs) |
| Average Score | 446.30 (for 1 correct submission) |
It should be evident that the constraints are much larger than TheCowDivTwo. In cases with these large constraints in counting problems in which a fast formula is not easily visible, it helps to try to modify the problem into something simpler, or at least different. In fact, the solution that will be described next, begins by taking a steps that may not initially appear to help but ultimately do allow us to find a suitable solution.
The problem asks us to count sets. Sets are orderless structures. In the TheCowDivTwo version of the problem, we kept an order 0,1,… N-1 in which we added the elements to the set. That is because we did not want to count [0,1,3] and [0,3,1] as different sets. So we use a fixed order when adding elements to a set if the objective is to count sets. This fixed order made the solution for the TheCowDivTwo problem easier to find. But actually, requiring a fixed order may sometimes complicate things. What is true is that if we could get rid of the fixed order requirement, we could do it. So, instead of counting sets, let us count sequences. But the sequences will still forbid repeated elements. So, for example, [1,2,3], [3,2,1], [2,1,3] are all sequences that we will count. But the problem asks us for sets, so, after finding the number of sequences without repetitions, we will need to somehow use the number to get the number of sets. This one is simple: For set [1,2,3] there are 6 sequences: [1,2,3], [1,3,2], [2,1,3], [2,3,1], [3,1,2], [3,2,1] - The sequences are all permutations of the set. The sequences we will find will have k elements exactly, this means that for each set, there will be k! sequences. If we know the total number of sequences that do not have repetitions, we can divide it by k! and find the total number of sets.
Let us focus on counting the number of sequences without repetition of K numbers 0,1,…N-1 such that their sum is a multiple of N. Imagine that we already have a sequence of K-1 numbers. Their sum is S=X1 + X2 + … + Xk-1. We want to find a Xk such that:
S + Xk = (0 mod N)
Note that we are not really interested in finding Xk, we just want to count the number of them. In this case, there is exactly one number modulo N that will make the sum work. Assume that we know the total number of sequences of different (K-1) elements. Then we can claim that after appending a specific Xk to each of those sequences, we will form a sequence that has a sum equal to 0 modulo N. This is the moment we note that the change to sequences was very useful, because else we would need to check if the unique Xk is greater than every of the previous values that form S. There is still a possibility of conflict: What if, the unique Xk solution was already in use as one of the previous elements (X1, … Xk-1) ? This means that of the sequences of K elements we found, there may be some of them that have one repeated value. We can assume there is exactly one repetition, because we counted the number of sequences of (k-1) elements that did not have repetitions and we added an element to each of them, so there is at most one repetition. To the result we have right now, we just need to subtract the number of sequences that have one repetition. What we have at the moment is this formula:
A = B - C
A : Number of sequences of K elements with no repetitions with a sum equal to 0 modulo N.
B : Number of sequences of K-1 elements with no repetitions.
C : Number of sequences of K elements with exactly one repetitions and a sum equal to 0 modulo N.
In order to calculate C note that instead of picking K numbers, we can pick K-1 different numbers and simply say that one of them needs to be multiplied by 2 when calculating the sum. A rephrase is then :
C : Number of sequences of K-1 numbers such that the sum: X1 + X2 + … + 2 * Xi + … Xk-1 is a multiple of N. And i is any index from 1 to K-1. We can divide this problem and consider each value of i. Then count:
C1 : Number of sequences of K-1 numbers such that the sum: 2*X1 + X2 + … + Xk-1 is a multiple of N.
C2 : Number of sequences of K-1 numbers such that the sum: X1 + 2*X2 + … + Xk-1 is a multiple of N.
…
Ck-1 : Number of sequences of K-1 numbers such that the sum: X1 + X2 + … + 2*Xk-1 is a multiple of N.
By symmetry, we can tell that the amounts of each of those kinds of sequences are all equal. So, we can calculate C1 or Ck-1 and then multiply the found value by (K-1) and find the value of C.
The sub-problems we need solutions for are all very similar to the main one. Problem Ck-1 requires us to count sequences of a length k (in this case, equal to the original K minus 1) such that: (X
1+X2…+2*Xk) is 0 modulo N. We can try to generalize it into the number of sequences of k elements such that (X1+X2…+a*Xk) is 0 modulo d. Note that we are parametrizing
both number of times the last element is repeated and the modulo. To use the generalization to solve problem Ck-1 simply use a=2 and d=N
Problem A then can be seen as counting sequences of length k such that (X1+X2…+a*Xk) is 0 modulo N. In fact, we can see that if we use the generalization above, we can claim that a=1 (multiply Xkby 1) and d = N (The sum must be 0 modulo N).
Problem B only needs the elements in the sequence of length k to be different, so (X1+X2…+Xk) may be any non-negative number. All non-negative integers are multiples of 1. So we can claim that for this sub-problem:
a=1 and d=1.
The sub-problem
We intend to make a function F(d,k,a) that solves all the sub-problems described above. The function returns the total amount of sequences of k unique elements such that:
(X1 + X2 + … + a * Xk) = 0 modulo d
In fact, this function would be enough to solve the main problem: F(N,k,1) will yield the count of sequences we need. This will in fact be similar to what we did to what we initially thought was the main problem. Assume we already knew the k-1 first
elements: (X1 , X2, …, Xk-1) (all of which are unique). Their sum is: S = (X1 + X2 + Xk-1).
We need to follow this condition:
S + a * Xk = 0 modulo d.
Once again, we do not need to find the actual value of Xk, only to count it. There is a difference:
S + a * Xk = 0 modulo d.
a * Xk = -S modulo d.
We can claim that Xk can be one of 0,1,…d-1 . Then (-S modulo d) must be equal to one of: (a * 0) modulo d, (a * 1) modulo d, … (a * (d-1)) modulo d. A known aspect of modular arithmetic is that, if g=gcd(a,d) then (a*0, a*1, a*2, … ) are equal to (0*g, 1*g, 2*g, … , (d/g-1)*g) and this cycle is repeated g times. -S has to be equal to 0*g, 1*g, … or (d/g-1)*g modulo d. In other words, -S needs to be a multiple of g. Which is equivalent to S being a multiple of g. S must be 0 modulo g. Finally, we need to count the number of Xk values. Note that if -S is equal to ((t*g) modulo d), t is one of (0,1,…d/g-1), the cycle is repeated g times. So there are g different numbers modulo d that will give the wanted result. We are however, not interested in number modulo d, we are interested in numbers 0,1,…,N-1. We can, as a precondition, assume that N will always be a multiple of d, so we can tell that the N numbers will be divided in N/d portions of d integers. So, for each result modulo d, there will be N/d results from 0 to N-1.
The amount of elements Xk we can add to the sequences of k-1 elements is then: g * (N/d). This means that for every sequence of k-1 unique integers such that their sum (X1 + … + Xk-1) is a multiple of g=gcd(a,d)),
there are g*(N/d) possible elements we can add to generate a sequence of k elements such that: (X1 + … + Xk-1 + a * Xk) is 0 modulo d. However, Xk may be equal to one of X1,X2, … Xk. The current result is to multiply: F(g, k-1, 1) * g * (N/d) . That will yield a number of sequences of k elements such that: (X1 + … + Xk-1 + a * Xk) is 0 modulo dwith the downfall that exactly one Xi, i<k may be equal to Xk. We need to subtract the invalid sequences. We can consider them to be sequences of exactly k-1 unique elements such that : (X1 + … + (a+1)* Xi + … + Xk-1) is 0 modulo d. Which translates to (K-1)*F(d, k-1, a+1).
The result is a recurrence:
F(d,k,a) = F( g, k-1, 1) * g * (N/d) - F(d, k-1, a+1)
Where g = gcd(d,a). A base case for this recurrence can be: k = 0. In this situation, there is only one possible sequence, the empty sequence. The sum of the elements in this sequence is always 0. Even if a was greater than 1. 0 is always a multiple of d. So, the base case is: If k=0, return 1. The pseudo-code for the recurrence is:
1
2
3
4
5
6
7
F(d, k, a) {
if (k == 0) {
return 1
}
g = gcd(a, d)
return F(g, k - 1, 1) * g * (N / d) - F(d, k - 1, a + 1)
}
Note that the solution to the main problem is ( F(N,K,1) / Factorial(K) ). Every new argument used as d comes from calculating the gcd between d and another number. Then every new argument d found will be a divisor of the original N. This was the precondition we used above. If we want to implement this recurrence, we may use memoization so that each state is evaluated exactly once, because the recurrence is acyclic (k always decreases). d can be as large as N, but is actually always a divisor of N. There are at most sqrt(N) divisors of N. k starts as K and will decrease until it reaches 0. a begins as 1 and will increase. It does not increase indefinitely, we can tell that it only increases when k decreases, so a is currently O(K). The total amount of states (assuming memoization) is therefore O( sqrt(N) * K * K ). sqrt(N) is at most ~= 31622, K is at most 1000. So, it may appear that the recurrence is too slow for the time limit.
In reality though, the recurrence is much faster than it seems. The largest part of the complexity is currently the sqrt(N). In reality, the arguments provided as d are just divisors of N, they are always generated as a gcd(a,d). Initially, a is 1, so gcd(N,1) will yield 1. a will increase to be at most 1000. This means that in the worst case, we will have gcd(1000,N) = 1000. With the exception of the initial value of d=N , all the other values will be smaller than or equal to 1000. This reduces the bound to O( K3) which is still large for K<= 1000.
There is an optimization we can make, we are letting a grow indefinitely (until it reaches k, actually) this can be reduced. We are always counting sequences such that (X1 + … + a * Xk) is 0 modulo d. Note that since we are operating modulo d, any value of a larger than d-1 is the same as (a modulo d) . We can claim that a will always be smaller than d. To ensure this, we just need to change the (a+1) in the recurrence into (a+1)%d. By considering that a must now always be smaller than d it can be concluded that this will at least halve the amount of operations. In reality, it optimizes the solution even more. We can see that for every divisor d of N which <= 1000, there will be exactly d values of a. The complexity is then (sum of divisors of N that are <=1000) * k. We can assume that the sum of those divisors will be less than 100000. Even if all numbers from 1 to 1000 were divisors of the number, their sum would be 1000*1001/2. = 500500. This is an exaggerated value. When multiplied by the 1000 possibilities of K, the result is 500500000. Note however that some large values of K are not allowed when a is also large. When considering that the sum of divisors is probably much smaller than 500500 and that multiplying 1000 for K is also an exaggeration. We can expect the bound of 500500000 to be much larger than the real execution time. This bound can be decreased more. There is also a way to optimize the solution by finding that F(N,K,1) = ( F(gcd(N,K), K, 1) / (N/(gcd(N,K)) ), they will be left as an exercise to the readers. The following C++ implementation of the solution takes 0.3 seconds in the worst case:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
const int MOD = 1000000007;
struct TheCowDivOne {
//Euclid's algorithm for the gcd.
int gcd(int a, int b) {
while (b != 0) {
int c = a;
a = b;
b = c % b;
}
return a;
}
// Finds x raised to the y-th power
// modulo m
int modPow(int x, int y, int m) {
long long r = 1, a = x;
while (y > 0) {
if (y % 2 == 1) {
r = (r * a) % m;
}
a = (a * a) % m;
y /= 2;
}
return (int) r;
}
// Finds the modular multiplicative inverse of a, modulo m
// Assuming m is prime, it uses Fermat's little theorem
int inverse(int a, int m) {
return modPow(a, m - 2, m);
}
int N;
map < vector < int > , int > mem;
int F(int d, int k, int a) {
if (k == 0) {
return 1;
}
// memoization works by using
// a vector<int> as key for the map.
vector < int > vec(3);
vec[0] = d;
vec[1] = k;
vec[2] = a;
if (mem.find(vec) != mem.end()) {
return mem[vec];
}
int g = gcd(a, d);
// We need to calculate:
// F(g,k-1,1)*N*g/d - F(d, k-1, (a+1)%d )
long long p = F(g, k - 1, 1);
p = (p * (((long long) N * g) / d)) % MOD;
long long q = F(d, k - 1, (a + 1) % d);
q = (q * (k - 1)) % MOD;
int x = (int)((p - q + MOD) % MOD);
// memoize the solution:
mem[vec] = x;
return x;
}
int find(int N, int K) {
this - > N = N;
long long res = F(N, K, 1);
// After finding the number of sequences of
// unique integers such that their sum is 0 mod N.
// We still need to divide by the factorial of K.
// x / K! = x*(K^-1)*(K-1)^-1* ....
// finding modular multiplicative inverses.
for (int i = 2; i <= K; i++) {
res = (res * inverse(i, MOD)) % MOD;
}
return (int) res;
}
};
Exercises:
Prove that F(N, K, 1) = F(gcd(N, K), K, 1) / (N / (gcd(N,K))).
Prove that the complexity of calculation F(gcd(N, K), K, 1) is O(K^1.5) or, more exactly, O(K * number_of_divisors(K)).